Ranking WebPages Using Web Structure Mining Concepts
Abstract
With the rapid growth of the Web, users easily get lost in its rich hyperlink structure. The primary goal of website owners is to provide relevant information that meets their users' needs. Web mining is one of the techniques that can help website owners achieve this. Web mining is divided into three categories: web content mining, web usage mining and web structure mining. Web structure mining plays an important role in this approach. Two page ranking algorithms, PageRank and Hyperlink-Induced Topic Search (HITS), are commonly used in web structure mining. Both algorithms treat all links equally when distributing rank scores. This paper also compares the two algorithms. Ranking web pages is an important task because it helps users find highly ranked pages that are relevant to their query. Different metrics have been proposed to rank web pages according to their quality, and the paper briefly discusses the two most prominent ones.
Keywords: Web Mining, Web Content Mining, Web Usage Mining, Web Structure Mining, HITS, PageRank, Authority and Hubs.
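As an illustration of the statement that both algorithms treat all links equally, the following is a minimal Python sketch of the textbook forms of PageRank and HITS. It is not the paper's implementation. The function names `pagerank` and `hits`, the `links` dictionary format, the damping factor of 0.85 and the fixed iteration count are all assumptions made for this example.

```python
from math import sqrt


def all_pages(links):
    """Collect every page that appears as a source or a target."""
    return set(links) | {t for targets in links.values() for t in targets}


def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank; `links` maps a page to the pages it links to.

    The damping value and iteration count are illustrative assumptions.
    """
    pages = all_pages(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            # A page with no out-links spreads its rank over all pages.
            targets = links.get(page) or list(pages)
            # Every out-link receives the same share: links are treated equally.
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def normalize(scores):
    """Scale scores to unit Euclidean length."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {p: v / norm for p, v in scores.items()}


def hits(links, iterations=50):
    """Textbook HITS; returns (hub, authority) score dictionaries."""
    pages = all_pages(links)
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of hub scores of pages linking in (unweighted).
        auth = {p: 0.0 for p in pages}
        for page, targets in links.items():
            for target in targets:
                auth[target] += hub[page]
        auth = normalize(auth)
        # Hub score: sum of authority scores of pages linked to (unweighted).
        hub = normalize(
            {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        )
    return hub, auth
```

For a hypothetical three-page graph such as `{"A": ["B", "C"], "B": ["C"], "C": ["A"]}`, calling `pagerank(graph)` returns one score per page, and `hits(graph)` returns separate hub and authority scores. In both functions a page's contribution is passed along its out-links without any per-link weight, which is the equal treatment of links the abstract refers to.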
Related Resources
Link Analysis: Hubs and Authorities on the World Wide Web
Ranking the tens of thousands of retrieved webpages for a user query on a Web search engine such that the most informative webpages are on the top is a key information retrieval technology. A popular ranking algorithm is the HITS algorithm of Kleinberg. It explores the reinforcing interplay between authority and hub webpages on a particular topic by taking into account the structure of the web ...
Optimal ranking in networks with community structure
The World-Wide Web (WWW) with its enormous size (~ 10 webpages) presents a challenge for efficient information retrieval and ranking. By effectively utilizing the topological information to rank the webpages, Google became the most popular tool on the web. One important feature of the WWW is that it exhibits a strong community structure in which groups of webpages (e.g. those devoted to a commo...
A Synonym Based Approach of Data Mining in Search Engine Optimization
In today’s era with the rapid growth of information on the web, makes users turn to search engines as a replacement of traditional media. This makes sorting of particular information through billions of webpages and displaying the relevant data makes the task tough for the search engine. Remedy for this is SEO (Search Engine Optimization), i.e. having a website optimized in such a way that it w...
Ranking National Football League teams using Google's PageRank
The search engine Google uses the PageRanks of webpages to determine the order in which they are displayed as the result of a web search. In this work we expand Google’s idea of webpage ranking to ranking National Football League teams. We think of the teams as webpages and use the statistics of a football season to create the connections (links) between the teams. The objective is to be able t...
MapReduce Based Information Retrieval Algorithms for Efficient Ranking of Webpages
In this paper, the authors discuss the MapReduce implementation of crawler, indexer and ranking algorithms in search engines. The proposed algorithms are used in search engines to retrieve results from the World Wide Web. A crawler and an indexer in a MapReduce environment are used to improve the speed of crawling and indexing. The proposed ranking algorithm is an iterative method that makes us...